22 research outputs found

    Tailored Source Code Transformations to Synthesize Computationally Diverse Program Variants

    Get PDF
    The predictability of program execution provides attackers a rich source of knowledge who can exploit it to spy or remotely control the program. Moving target defense addresses this issue by constantly switching between many diverse variants of a program, which reduces the certainty that an attacker can have about the program execution. The effectiveness of this approach relies on the availability of a large number of software variants that exhibit different executions. However, current approaches rely on the natural diversity provided by off-the-shelf components, which is very limited. In this paper, we explore the automatic synthesis of large sets of program variants, called sosies. Sosies provide the same expected functionality as the original program, while exhibiting different executions. They are said to be computationally diverse. This work addresses two objectives: comparing different transformations for increasing the likelihood of sosie synthesis (densifying the search space for sosies); demonstrating computation diversity in synthesized sosies. We synthesized 30184 sosies in total, for 9 large, real-world, open source applications. For all these programs we identified one type of program analysis that systematically increases the density of sosies; we measured computation diversity for sosies of 3 programs and found diversity in method calls or data in more than 40% of sosies. This is a step towards controlled massive unpredictability of software

    DSpot: Test Amplification for Automatic Assessment of Computational Diversity

    Full text link
    Context: Computational diversity, i.e., the presence of a set of programs that all perform compatible services but that exhibit behavioral differences under certain conditions, is essential for fault tolerance and security. Objective: We aim at proposing an approach for automatically assessing the presence of computational diversity. In this work, computationally diverse variants are defined as (i) sharing the same API, (ii) behaving the same according to an input-output based specification (a test-suite) and (iii) exhibiting observable differences when they run outside the specified input space. Method: Our technique relies on test amplification. We propose source code transformations on test cases to explore the input domain and systematically sense the observation domain. We quantify computational diversity as the dissimilarity between observations on inputs that are outside the specified domain. Results: We run our experiments on 472 variants of 7 classes from open-source, large and thoroughly tested Java classes. Our test amplification multiplies by ten the number of input points in the test suite and is effective at detecting software diversity. Conclusion: The key insights of this study are: the systematic exploration of the observable output space of a class provides new insights about its degree of encapsulation; the behavioral diversity that we observe originates from areas of the code that are characterized by their flexibility (caching, checking, formatting, etc.).Comment: 12 page

    Identification d’une architecture à base de composants dans une application orientée objets à l’aide d’une analyse dynamique

    Full text link
    Un système, décrit avec un grand nombre d'éléments fortement interdépendants, est complexe, difficile à comprendre et à maintenir. Ainsi, une application orientée objet est souvent complexe, car elle contient des centaines de classes avec de nombreuses dépendances plus ou moins explicites. Une même application, utilisant le paradigme composant, contiendrait un plus petit nombre d'éléments, faiblement couplés entre eux et avec des interdépendances clairement définies. Ceci est dû au fait que le paradigme composant fournit une bonne représentation de haut niveau des systèmes complexes. Ainsi, ce paradigme peut être utilisé comme "espace de projection" des systèmes orientés objets. Une telle projection peut faciliter l'étape de compréhension d'un système, un pré-requis nécessaire avant toute activité de maintenance et/ou d'évolution. De plus, il est possible d'utiliser cette représentation, comme un modèle pour effectuer une restructuration complète d'une application orientée objets opérationnelle vers une application équivalente à base de composants tout aussi opérationnelle. Ainsi, La nouvelle application bénéficiant ainsi, de toutes les bonnes propriétés associées au paradigme composants. L'objectif de ma thèse est de proposer une méthode semi-automatique pour identifier une architecture à base de composants dans une application orientée objets. Cette architecture doit, non seulement aider à la compréhension de l'application originale, mais aussi simplifier la projection de cette dernière dans un modèle concret de composant. L'identification d'une architecture à base de composants est réalisée en trois grandes étapes: i) obtention des données nécessaires au processus d'identification. Elles correspondent aux dépendances entre les classes et sont obtenues avec une analyse dynamique de l'application cible. ii) identification des composants. Trois méthodes ont été explorées. La première utilise un treillis de Galois, la seconde deux méta-heuristiques et la dernière une méta-heuristique multi-objective. iii) identification de l'architecture à base de composants de l'application cible. Cela est fait en identifiant les interfaces requises et fournis pour chaque composant. Afin de valider ce processus d'identification, ainsi que les différents choix faits durant son développement, j'ai réalisé différentes études de cas. Enfin, je montre la faisabilité de la projection de l'architecture à base de composants identifiée vers un modèle concret de composants.A system is complex and particularly difficult to understand and to maintain when it is described with a large number of highly interdependent parties. An object-oriented application is often complex because it uses hundreds or thousands of classes with many different dependencies more or less explicit. The same application, using the component paradigm, contains a smaller number of loosely coupled parties, highly cohesive with clear inter-dependencies. Indeed, because the component paradigm provides a high-level representation, synthetic and well-organized structure of complex systems, it can provide a space of projection for object-oriented applications. Such projection facilitates the step of understanding a system prior to any activity of maintenance and/or evolution. In addition, it is possible to use this representation as a model to perform a complete restructuring of an operational object-oriented application into its equivalent operational component-based application. Thus, the new form of the application benefits from all the good properties associated with the component-oriented paradigm. The goal of my thesis is to propose a semi-automatic approach to identify a component-based architecture in an object-oriented application. This architecture should help in understanding the original application, but also simplifies the projection of the object-oriented application on a concrete component model. The identification of a component-based architecture is achieved in three main steps: i) obtaining data for the identification process. These data, which correspond to dependencies between classes, are obtained with a dynamic analysis of the target application. ii) identification of the components. Three methods were explored. The first uses the formal concept analysis, the second two meta-heuristics and the last a multiobjective meta-heuristic. iii) identification of the component-based architecture representing the target application. This is done by identifying the provided and required interfaces for each component. To validate this identification process, and the different choices made during its development, I realized several case studies. Finally, I show the feasibility of the projection of the identified component-based architecture on a specific component model

    Assessing Product Line Derivation Operators Applied to Java Source Code: An Empirical Study

    Get PDF
    International audienceProduct Derivation is a key activity in Software Product Line Engineering. During this process, derivation operators modify or create core assets (e.g., model elements, source code instructions, components) by adding, removing or substituting them according to a given configuration. The result is a derived product that generally needs to conform to a programming or modeling language. Some operators lead to invalid products when applied to certain assets, some others do not; knowing this in advance can help to better use them, however this is challenging, specially if we consider assets expressed in extensive and complex languages such as Java. In this paper, we empirically answer the following question: which product line operators, applied to which program elements , can synthesize variants of programs that are incorrect , correct or perhaps even conforming to test suites? We implement source code transformations, based on the derivation operators of the Common Variability Language. We automatically synthesize more than 370,000 program variants from a set of 8 real large Java projects (up to 85,000 lines of code), obtaining an extensive panorama of the sanity of the operations

    Domain Specific Warnings: Are They Any Better?

    Get PDF
    Abstract—Tools to detect coding standard violations in source code are commonly used to improve code quality. One of their original goals is to prevent bugs, yet, a high number of false positives is generated by the rules of these tools, i.e., most warnings do not indicate real bugs. There are empirical evidences supporting the intuition that the rules enforced by such tools do not prevent the introduction of bugs in software. This may occur because the rules are too generic and do not focus on domain specific problems of the software under analysis. We underwent an investigation of rules created for a specific domain based on expert opinion to understand if such rules are worthwhile enforcing in the context of defect prevention. In this paper, we performed a systematic study to investigate the relation between generic and domain specific warnings and observed defects. From our experiment on a real case, long term evolution, software, we have found that domain specific rules provide better defect prevention than generic ones. I

    A Framework to Compare Alert Ranking Algorithms

    Get PDF
    Abstract—To improve software quality, rule checkers statically check if a software contains violations of good programming practices. On a real sized system, the alerts (rule violations detected by the tool) may be numbered by the thousands. Unfortunately, these tools generate a high proportion of “false alerts”, which in the context of a specific software, should not be fixed. Huge numbers of false alerts may render impossible the finding and correction of “true alerts ” and dissuade developers from using these tools. In order to overcome this problem, the literature provides different ranking methods that aim at computing the probability of an alert being a “true one”. In this paper, we propose a framework for comparing these ranking algorithms and identify the best approach to rank alerts. We have selected six algorithms described in literature. For comparison, we use a benchmark covering two programming languages (Java and Smalltalk) and three rule checkers (FindBug, PMD, SmallLint). Results show that the best ranking methods are based on the history of past alerts and their location. We could not identify any significant advantage in using statistical tools such as linear regression or Bayesian networks or ad-hoc methods

    From Object-Oriented Applications to Component-Oriented Applications via Component-Oriented Architecture

    Get PDF
    10 pagesInternational audienceObject-oriented applications of significant size are often complex and therefore costly to maintain. Indeed, they rely on the concept of class which has low granularity with varied dependencies not always explicit. The component paradigm provides a projection space well-structured and of highest level for a better understanding through abstract architectural views. But it is possible to go further. It may also be the ultimate target of a complete process of re engineering. The end-to-end automation of this process is a subject on which literature has made very little attention. In this paper, we propose such a method to automatically transform an object-oriented application in an operational component-oriented application. We illustrate this method on a real Java application which is transformed in an operational OSGi application

    Automatic Software Diversity in the Light of Test Suites

    No full text
    11 pages, 4 figures, 8 listings, conferenceA few works address the challenge of automating software diversification, and they all share one core idea: using automated test suites to drive diversification. However, there is is lack of solid understanding of how test suites, programs and transformations interact one with another in this process. We explore this intricate interplay in the context of a specific diversification technique called "sosiefication". Sosiefication generates sosie programs, i.e., variants of a program in which some statements are deleted, added or replaced but still pass the test suite of the original program. Our investigation of the influence of test suites on sosiefication exploits the following observation: test suites cover the different regions of programs in very unequal ways. Hence, we hypothesize that sosie synthesis has different performances on a statement that is covered by one hundred test case and on a statement that is covered by a single test case. We synthesize 24583 sosies on 6 popular open-source Java programs. Our results show that there are two dimensions for diversification. The first one lies in the specification: the more test cases cover a statement, the more difficult it is to synthesize sosies. Yet, to our surprise, we are also able to synthesize sosies on highly tested statements (up to 600 test cases), which indicates an intrinsic property of the programs we study. The second dimension is in the code: we manually explore dozens of sosies and characterize new types of forgiving code regions that are prone to diversification
    corecore